AITopics | habib hajimolahoseini

habib hajimolahoseini

Accelerating the Low-Rank Decomposed Models

Hajimolahoseini, Habib, Ahmed, Walid, Wen, Austin, Liu, Yang

arXiv.org Artificial Intelligence, Jul-24-2024

Tensor decomposition is a mathematically supported technique for data compression. It consists of applying some form of low-rank decomposition to tensors or matrices in order to reduce the redundancy of the data. However, it is not a popular technique for compressing AI models due to the high number of new layers added to the architecture after decomposition. Although the number of parameters can shrink significantly, the model can become more than twice as deep, which can add latency to training or inference. In this paper, we present a comprehensive study of how to modify the low-rank decomposition technique in AI models so that we benefit from both high accuracy and low memory consumption while also speeding up training and inference.
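The trade-off described in this abstract can be made concrete with a little arithmetic. The sketch below is not taken from the paper; it assumes the simplest case, a single fully connected layer whose weight matrix is replaced by two thinner factors, and the function names are hypothetical.

```python
def lrd_linear_params(in_features, out_features, rank):
    """Weight counts before and after a rank-`rank` factorization.

    Assumption: one dense layer (in_features x out_features) becomes two
    dense layers (in_features x rank) and (rank x out_features); biases
    are ignored.
    """
    original = in_features * out_features
    decomposed = rank * (in_features + out_features)
    return original, decomposed


def break_even_rank(in_features, out_features):
    """Largest rank at which the two factors hold strictly fewer weights.

    Solves rank * (m + n) < m * n for integer rank.
    """
    product = in_features * out_features
    total = in_features + out_features
    return (product - 1) // total


original, decomposed = lrd_linear_params(1024, 1024, 128)
print(original, decomposed, original / decomposed)
print(break_even_rank(1024, 1024))
```

For a 1024 x 1024 layer at rank 128, the weight count falls from 1,048,576 to 262,144, yet one layer has become two. That doubling of depth is the latency cost the abstract refers to.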

arxiv preprint arxiv, convolutional layer, decomposition, (13 more...)

2407.20266

Country: North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Is 3D Convolution with 5D Tensors Really Necessary for Video Analysis?

Hajimolahoseini, Habib, Ahmed, Walid, Wen, Austin, Liu, Yang

arXiv.org Artificial Intelligence, Jul-23-2024

In this paper, we present a comprehensive study and propose several novel techniques for implementing 3D convolutional blocks using 2D and/or 1D convolutions with only 4D and/or 3D tensors. Our motivation is that 3D convolutions with 5D tensors are computationally very expensive and they may not be supported by some of the edge devices used in real-time applications such as robots. The existing approaches mitigate this by splitting the 3D kernels into spatial and temporal domains, but they still use 3D convolutions with 5D tensors in their implementations. We resolve this issue by introducing some appropriate 4D/3D tensor reshaping as well as new combination techniques for spatial and temporal splits. The proposed implementation methods show significant improvement both in terms of efficiency and accuracy. The experimental results confirm that the proposed spatio-temporal processing structure outperforms the original model in terms of speed and accuracy using only 4D tensors with fewer parameters.
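To illustrate what working with 4D and 3D tensors instead of 5D tensors means, the sketch below tracks only shapes and weight counts for the common spatial/temporal split of a 3D convolution. The abstract does not describe the paper's own reshaping and combination techniques, so this is generic bookkeeping under the assumption of a `(batch, channels, time, height, width)` layout; all function names are hypothetical.

```python
def spatial_view(shape_5d):
    """4D shape seen by a 2D convolution: time is folded into the batch axis.

    In a real tensor library this needs a permute to (n, t, c, h, w)
    before the reshape.
    """
    n, c, t, h, w = shape_5d
    return (n * t, c, h, w)


def temporal_view(shape_5d):
    """3D shape seen by a 1D convolution: space is folded into the batch axis.

    In a real tensor library this needs a permute to (n, h, w, c, t)
    before the reshape.
    """
    n, c, t, h, w = shape_5d
    return (n * h * w, c, t)


def conv3d_weights(c_in, c_out, kt, kh, kw):
    """Weights of a full 3D kernel (bias ignored)."""
    return c_in * c_out * kt * kh * kw


def split_weights(c_in, c_mid, c_out, kt, kh, kw):
    """Weights of a 2D spatial kernel followed by a 1D temporal kernel."""
    return c_in * c_mid * kh * kw + c_mid * c_out * kt


clip = (2, 64, 16, 56, 56)
print(spatial_view(clip), temporal_view(clip))
print(conv3d_weights(64, 64, 3, 3, 3), split_weights(64, 64, 64, 3, 3, 3))
```

With 64 channels throughout and a 3 x 3 x 3 kernel, the split uses 49,152 weights where the full 3D kernel uses 110,592, and no intermediate tensor has more than four dimensions.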

arxiv preprint arxiv, convolution, tensor, (12 more...)

2407.16514

Country: North America > Canada > Ontario > Toronto (0.04)

Genre:

Research Report > Promising Solution (0.54)
Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.72)

GQKVA: Efficient Pre-training of Transformers by Grouping Queries, Keys, and Values

Javadi, Farnoosh, Ahmed, Walid, Hajimolahoseini, Habib, Ataiefard, Foozhan, Hassanpour, Mohammad, Asani, Saina, Wen, Austin, Awad, Omar Mohamed, Liu, Kangling, Liu, Yang

arXiv.org Artificial Intelligence, Dec-13-2023

Massive transformer-based models face several challenges, including slow and computationally intensive pre-training and over-parametrization. This paper addresses these challenges by proposing a versatile method called GQKVA, which generalizes query, key, and value grouping techniques. GQKVA is designed to speed up transformer pre-training while reducing the model size. Our experiments with various GQKVA variants highlight a clear trade-off between performance and model size, allowing for customized choices based on resource and time limitations. Our findings also indicate that the conventional multi-head attention approach is not always the best choice, as there are lighter and faster alternatives available. We tested our method on ViT, which achieved an approximate 0.3% increase in accuracy while reducing the model size by about 4% in the task of image classification. Additionally, our most aggressive model reduction experiment resulted in a reduction of approximately 15% in model size, with only around a 1% drop in accuracy.
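The link between grouping and model size can be sketched with a weight count. The abstract does not define how GQKVA groups are formed, so the accounting below is an assumption rather than the paper's definition: each group owns one projection of width `head_dim`, which every head in that group shares. The function name and parameters are hypothetical.

```python
def qkv_projection_weights(model_dim, head_dim, q_groups, k_groups, v_groups):
    """Weights in the query, key, and value projections (bias ignored).

    Assumption: one (model_dim x head_dim) projection per group.
    Setting every group count equal to the number of heads gives
    ordinary multi-head attention.
    """
    per_group = model_dim * head_dim
    return per_group * (q_groups + k_groups + v_groups)


# Hypothetical ViT-Base-like sizes: 768-dim model, 12 heads of width 64.
baseline = qkv_projection_weights(768, 64, 12, 12, 12)
grouped_kv = qkv_projection_weights(768, 64, 12, 4, 4)
print(baseline, grouped_kv)
```

Under this assumption, sharing keys and values across groups of three heads cuts the projection weights from 1,769,472 to 983,040. Whether accuracy holds at a given grouping is an empirical question that the experiments in the abstract address.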

matrix, model size, transformer, (14 more...)

2311.03426

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > Canada > British Columbia (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

SwiftLearn: A Data-Efficient Training Method of Deep Learning Models using Importance Sampling

Hajimolahoseini, Habib, Awad, Omar Mohamed, Ahmed, Walid, Wen, Austin, Asani, Saina, Hassanpour, Mohammad, Javadi, Farnoosh, Ahmadi, Mehdi, Ataiefard, Foozhan, Liu, Kangling, Liu, Yang

arXiv.org Artificial Intelligence, Nov-25-2023

In this paper, we present SwiftLearn, a data-efficient approach to accelerate training of deep learning models using a subset of data samples selected during the warm-up stages of training. This subset is selected based on an importance criterion measured over the entire dataset during the warm-up stages, aiming to preserve the model performance with fewer examples during the rest of training. The importance measure we propose can be updated periodically during training, so that every data sample has a chance to return to the training loop if it shows a higher importance. The model architecture is unchanged, but since the number of data samples controls the number of forward and backward passes during training, we can reduce the training time by reducing the number of training samples used in each epoch. Experimental results on a variety of CV and NLP models during both pretraining and finetuning show that the model performance can be preserved while achieving a significant speed-up during training. More specifically, BERT finetuning on the GLUE benchmark shows that almost 90% of the data can be dropped, achieving an end-to-end average speedup of 3.36x while keeping the average accuracy drop below 0.92%.
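The selection step can be sketched as a top-k filter over per-sample importance scores. The abstract does not say how importance is computed, so the example assumes the scores are already available (for instance, a per-sample loss recorded during warm-up, which is an assumption and not the paper's stated criterion). The function name is hypothetical.

```python
def select_important(importance, keep_fraction):
    """Indices of the highest-scoring samples, in dataset order.

    importance    -- one score per sample (assumed: higher means keep)
    keep_fraction -- share of the dataset to retain, in (0, 1]
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    keep = max(1, round(len(importance) * keep_fraction))
    ranked = sorted(range(len(importance)),
                    key=lambda i: importance[i], reverse=True)
    return sorted(ranked[:keep])


# Hypothetical scores gathered during warm-up.
scores = [0.1, 2.3, 0.05, 1.7, 0.4, 0.9, 3.1, 0.2, 0.6, 1.1]
subset = select_important(scores, keep_fraction=0.3)
print(subset)
```

Calling the function again with refreshed scores at a later epoch gives dropped samples a chance to re-enter the subset, in the spirit of the periodic update the abstract describes.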

data sample, dataset, habib hajimolahoseini, (12 more...)

2311.15134

Country:

Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Ontario > Toronto (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Improving Resnet-9 Generalization Trained on Small Datasets

Awad, Omar Mohamed, Hajimolahoseini, Habib, Lim, Michael, Gosal, Gurpreet, Ahmed, Walid, Liu, Yang, Deng, Gordon

arXiv.org Artificial Intelligence, Sep-7-2023

This paper presents our proposed approach that won the first prize at the ICLR competition "Hardware Aware Efficient Training". The challenge is to achieve the highest possible accuracy in an image classification task in less than 10 minutes. The training is done on a small dataset of 5000 images picked randomly from the CIFAR-10 dataset. The evaluation is performed by the competition organizers on a secret dataset with 1000 images of the same size. Our approach applies a series of techniques for improving the generalization of ResNet-9, including sharpness-aware optimization, label smoothing, gradient centralization, input patch whitening, and meta-learning-based training.
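Of the techniques listed, label smoothing is the simplest to show in isolation. The sketch below uses the standard formulation, in which a one-hot target is mixed with a uniform distribution over the classes. The smoothing strength `epsilon=0.1` is a common default and an assumption here; the abstract does not give the value used in the competition entry.

```python
def smooth_labels(label, num_classes, epsilon=0.1):
    """Smoothed target distribution for a single example.

    Every class receives epsilon / num_classes; the true class
    additionally receives the remaining 1 - epsilon.
    """
    off_value = epsilon / num_classes
    return [1.0 - epsilon + off_value if k == label else off_value
            for k in range(num_classes)]


# CIFAR-10 has 10 classes; class index 2 is an arbitrary example.
target = smooth_labels(2, num_classes=10)
print(target)
```

Training against this softened target instead of a hard one-hot vector discourages over-confident predictions, which tends to help when the training set is as small as the 5000 images described above.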

dataset, generalization, habib hajimolahoseini, (11 more...)

2309.03965

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > Canada > Ontario > Toronto (0.04)
North America > Canada > Ontario > Kingston (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Training Acceleration of Low-Rank Decomposed Networks using Sequential Freezing and Rank Quantization

Hajimolahoseini, Habib, Ahmed, Walid, Liu, Yang

arXiv.org Artificial Intelligence, Sep-7-2023

Low Rank Decomposition (LRD) is a model compression technique applied to the weight tensors of deep learning models in order to reduce the number of trainable parameters and the computational complexity. However, due to the high number of new layers added to the architecture after applying LRD, it may not lead to a high training/inference acceleration if the decomposition ranks are not small enough. The issue is that using small ranks increases the risk of a significant accuracy drop after decomposition. In this paper, we propose two techniques for accelerating low-rank decomposed models without requiring small decomposition ranks. These methods include rank optimization and sequential freezing of decomposed layers. We perform experiments on both convolutional and transformer-based models. Experiments show that, when combined, these techniques can improve the model throughput by up to 60% during training and 37% during inference while keeping the accuracy close to that of the original models.
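The title mentions rank quantization, which the abstract does not define. One plausible reading, stated here purely as an assumption, is that each decomposition rank is snapped to a multiple of a hardware-friendly size so that the resulting thin layers map cleanly onto accelerator tiles. The sketch below shows that rounding only; the function name and the default multiple of 8 are hypothetical.

```python
def quantize_rank(rank, multiple=8):
    """Round a rank down to the nearest multiple, never below one multiple.

    Assumption: ranks aligned to `multiple` run more efficiently on the
    target hardware; the right value depends on the device.
    """
    if rank < 1 or multiple < 1:
        raise ValueError("rank and multiple must be positive")
    return max(multiple, (rank // multiple) * multiple)


for raw in (5, 157, 256):
    print(raw, quantize_rank(raw))
```

Rounding down keeps the parameter count at or below the unquantized budget, except for very small ranks, which are raised to a single multiple.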

convolutional layer, decomposed layer, freezing, (14 more...)

2309.03824

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > Canada > Ontario > Kingston (0.04)

Genre: Research Report (0.71)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
